Abstract

This paper deals with the problem of estimating the finite population mean when
information on two auxiliary attributes is available. A class of estimators is defined which
includes the estimators recently proposed by Malik and Singh (2012), Naik and Gupta (1996)
and Singh et al. (2007) as particular cases. It is shown that the proposed estimator is more
efficient than the usual mean estimator and other existing estimators. The study is also
extended to two-phase sampling. The results are illustrated numerically using
empirical populations considered in the literature.

Keywords: Simple random sampling, two-phase sampling, auxiliary attribute, point bi-serial
correlation, phi correlation, efficiency.



1. Introduction 

In some situations, information is available on two qualitative variables rather than on a
single auxiliary attribute. For illustration, to estimate hourly wages we can use
information on marital status and region of residence (see Gujarati and Sangeetha
(2007), p. 311). Here we assume that both auxiliary attributes have significant point
bi-serial correlation with the study variable and that there is significant phi-correlation (see
Yule (1912)) between the auxiliary attributes. The use of auxiliary information can increase the
precision of an estimator when the study variable Y is highly correlated with the auxiliary
variable X. In survey sampling, auxiliary variables are often measured on a ratio scale (e.g.
income, output, prices, costs, height and temperature), but sometimes they are qualitative
or nominal, such as sex, race, color, religion, nationality and geographical
region. For example, female workers are found to earn less than their male counterparts, and
non-white workers are found to earn less than white workers (see Gujarati and Sangeetha (2007),
p. 304). Naik and Gupta (1996) introduced a ratio estimator for the case when the study variable
and the auxiliary attribute are positively correlated. Jhajj et al. (2006) suggested a family of
estimators for the population mean in single and two-phase sampling when the study variable
and auxiliary attribute are positively correlated. Shabbir and Gupta (2007), Singh et al.
(2008), Singh et al. (2010) and Abd-Elfattah et al. (2010) have considered the problem of
estimating the population mean $\bar{Y}$ taking into consideration the point bi-serial correlation
between the auxiliary attribute and the study variable.
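
The two correlation measures mentioned above can be illustrated with a minimal sketch. It assumes the usual definitions: the point bi-serial correlation is the Pearson correlation between a numeric variable and a 0/1 attribute, and the phi coefficient is computed from the cell counts of the 2x2 table of two 0/1 attributes. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from math import sqrt


def point_biserial(y, phi):
    """Pearson correlation between numeric values y and a 0/1 attribute phi."""
    n = len(y)
    mean_y = sum(y) / n
    mean_phi = sum(phi) / n
    cov = sum((a - mean_y) * (b - mean_phi) for a, b in zip(y, phi))
    var_y = sum((a - mean_y) ** 2 for a in y)
    var_phi = sum((b - mean_phi) ** 2 for b in phi)
    return cov / sqrt(var_y * var_phi)


def phi_coefficient(phi1, phi2):
    """Phi coefficient of two 0/1 attributes from their 2x2 table counts."""
    n11 = sum(1 for a, b in zip(phi1, phi2) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(phi1, phi2) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(phi1, phi2) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(phi1, phi2) if a == 0 and b == 0)
    num = n11 * n00 - n10 * n01
    den = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return num / den


# Hypothetical toy data: six units, one study variable, two attributes.
y = [10, 12, 3, 4, 11, 5]
phi1 = [1, 1, 0, 0, 1, 0]
phi2 = [1, 0, 0, 0, 1, 1]

print(point_biserial(y, phi1))
print(phi_coefficient(phi1, phi2))
```

For 0/1 data the phi coefficient coincides with the Pearson correlation of the two indicator variables, so `point_biserial(phi1, phi2)` returns the same value as `phi_coefficient(phi1, phi2)`.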



2. Some Estimators in Literature 

To estimate the population mean of the study variable y when the
population proportion P is known, Naik and Gupta (1996) and Singh et al. (2007), respectively,
proposed the following estimators:
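
The displayed definitions are not reproduced in this text. As a rough sketch only, the code below assumes the forms usually attributed to these authors: ratio and product estimators $\bar{y}P/p$ and $\bar{y}p/P$ (Naik and Gupta), and their exponential counterparts $\bar{y}\exp[(P-p)/(P+p)]$ and $\bar{y}\exp[(p-P)/(p+P)]$ (Singh et al.). The function names and the sample values are hypothetical.

```python
from math import exp


def ratio_estimator(ybar, p, P):
    """Ratio-type estimator: ybar * P / p (assumed form)."""
    return ybar * P / p


def product_estimator(ybar, p, P):
    """Product-type estimator: ybar * p / P (assumed form)."""
    return ybar * p / P


def exp_ratio_estimator(ybar, p, P):
    """Exponential ratio-type estimator (assumed form)."""
    return ybar * exp((P - p) / (P + p))


def exp_product_estimator(ybar, p, P):
    """Exponential product-type estimator (assumed form)."""
    return ybar * exp((p - P) / (p + P))


# Hypothetical sample summary: sample mean 50, sample proportion 0.5,
# known population proportion 0.4.
for estimator in (ratio_estimator, product_estimator,
                  exp_ratio_estimator, exp_product_estimator):
    print(estimator.__name__, estimator(50.0, 0.5, 0.4))
```

When the sample proportion equals the population proportion, all four functions return the sample mean unchanged, which is a quick sanity check on any implementation of these forms.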
The bias and MSE expressions of the estimators $t_i$ (i = 1, 2, 3, 4), up to the first order of
approximation, are, respectively, given by
$\mathrm{MSE}(t_3) = \bar{Y}^2 f_1 \left[ C_y^2 + C_{p_1}^2 \left( \tfrac{1}{4} - K_{pb_1} \right) \right]$   (2.11)
Malik and Singh (2012) proposed the estimators $t_5$ and $t_6$ as

where $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are real constants.

The bias and MSE expressions of the estimators $t_5$ and $t_6$, up to the first order of
approximation, are, respectively, given by



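$B(t_5) = \bar{Y} f_1 \left[ C_{p_1}^2 \left( \dfrac{\alpha_1^2}{2} + \dfrac{\alpha_1}{2} - \alpha_1 k_{pb_1} \right) + C_{p_2}^2 \left( \dfrac{\alpha_2^2}{2} + \dfrac{\alpha_2}{2} - \alpha_2 k_{pb_2} \right) + \alpha_1 \alpha_2 k_{\phi} \right]$   (2.15)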







Rajesh Singh ■ Florentin Smarandache (editors) 
3. The Suggested Class of Estimators

Using a linear combination of $t_i$ (i = 0, 1, 2), we define an estimator of the form

where $w_i$ (i = 0, 1, 2) denotes the constants used for reducing the bias in the class of
estimators, H denotes the set of those estimators that can be constructed from $t_i$ (i = 0, 1, 2)
and R denotes the set of real numbers (for details see Singh et al. (2008)). Also,
$L_i$ (i = 1, 2, ..., 8) are either real numbers or functions of the known parameters of the
auxiliary attributes.

Expressing $t_p$ in terms of the e's, we have
L,P, + L-,
After expanding, subtracting $\bar{Y}$ from both sides of equation (3.3) and neglecting the
terms having power greater than two, we have

$(t_p - \bar{Y}) = \bar{Y} \left[ e_0 - w_1 (\alpha_1 \varphi_1 e_1 + \alpha_2 \varphi_2 e_2) - w_2 (\beta_1 \theta_1 e_1 - \beta_2 \theta_2 e_2) \right]$   (3.4)



Squaring both sides of (3.4) and then taking expectations, we get the MSE of the estimator $t_p$,
up to the first order of approximation, as
4. Empirical Study

Data: (Source: Government of Pakistan (2004))

The population consists of rice cultivation areas in 73 districts of Pakistan. The variables
are defined as:

Y = rice production (in '000 tonnes, with one tonne = 0.984 ton) during 2003,

$P_1$ = proportion of farms where rice production is more than 20 tonnes during the year 2002, and
$P_2$ = proportion of farms with rice cultivation area more than 20 ha during the year 2003.

For this data, we have 

$N = 73$, $\bar{Y} = 61.3$, $P_1 = 0.4247$, $P_2 = 0.3425$, $S_y^2 = 12371.4$, $S_{\phi_1}^2 = 0.225490$, $S_{\phi_2}^2 = 0.228311$,

$\rho_{pb_1} = 0.621$, $\rho_{pb_2} = 0.673$, $\rho_{\phi} = 0.889$.

Table 4.1: PRE of different estimators of $\bar{Y}$ with respect to $\bar{y}$.
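
The percent relative efficiency (PRE) of an estimator t with respect to $\bar{y}$ is taken here as $100 \times \mathrm{MSE}(\bar{y}) / \mathrm{MSE}(t)$. The sketch below is only an illustration under stated assumptions: it uses the standard first-order MSE of the ratio-type estimator, $\bar{Y}^2 f_1 [C_y^2 + C_p^2(1 - 2K_{pb})]$, and of the exponential ratio-type estimator, $\bar{Y}^2 f_1 [C_y^2 + C_p^2(1/4 - K_{pb})]$, with $K_{pb} = \rho_{pb} C_y / C_p$. The common factor $\bar{Y}^2 f_1$ cancels in the ratio, so the sample size is not needed. The function name is hypothetical, and the printed values are first-order approximations that need not coincide with the entries of Table 4.1.

```python
def pre_vs_mean(cy2, cp2, rho_pb, kind):
    """First-order PRE (in percent) of an estimator relative to the sample mean.

    cy2, cp2 : squared coefficients of variation of y and of the attribute
    rho_pb   : point bi-serial correlation between y and the attribute
    kind     : "ratio" or "exp_ratio" (assumed first-order MSE forms)
    """
    k = rho_pb * (cy2 / cp2) ** 0.5  # K_pb = rho_pb * C_y / C_p
    if kind == "ratio":
        extra = cp2 * (1.0 - 2.0 * k)
    elif kind == "exp_ratio":
        extra = cp2 * (0.25 - k)
    else:
        raise ValueError("kind must be 'ratio' or 'exp_ratio'")
    return 100.0 * cy2 / (cy2 + extra)


# Values listed above for the rice data set (first attribute only).
Y_BAR, P1 = 61.3, 0.4247
S2_Y, S2_PHI1, RHO_PB1 = 12371.4, 0.225490, 0.621

cy2 = S2_Y / Y_BAR ** 2   # C_y^2 = S_y^2 / Ybar^2
cp2 = S2_PHI1 / P1 ** 2   # C_p1^2 = S_phi1^2 / P1^2

print(pre_vs_mean(cy2, cp2, RHO_PB1, "ratio"))
print(pre_vs_mean(cy2, cp2, RHO_PB1, "exp_ratio"))
```

A PRE above 100 indicates that, to first order, the estimator has a smaller MSE than the sample mean; under these formulas that happens for the ratio-type estimator when $K_{pb} > 1/2$ and for the exponential ratio-type estimator when $K_{pb} > 1/4$.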
5. Double Sampling

It is assumed that the population proportion $P_1$ for the first auxiliary attribute $\phi_1$ is
unknown but that the same is known for the second auxiliary attribute $\phi_2$. When $P_1$ is unknown, it
is sometimes estimated from a preliminary large sample of size $n'$ on which only the
attribute $\phi_1$ is measured. Then a second-phase sample of size n (n < n') is drawn and Y is
observed.

Let $p_j' = \dfrac{1}{n'} \sum_{i=1}^{n'} \phi_{ji}$, $j = 1, 2$.
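
The two-phase scheme described above can be sketched in a few lines. The sketch assumes simple random sampling without replacement at both phases, with the second-phase sample drawn as a subsample of the first, and it uses the standard two-phase ratio-type form $\bar{y}\,p_1'/p_1$ purely as an illustration. The function name and the toy population are hypothetical.

```python
import random


def two_phase_ratio(y, phi1, n_first, n_second, rng):
    """Draw a two-phase sample and return (ybar, p1_first, p1_second, estimate).

    First phase : n_first units, only the attribute phi1 is recorded.
    Second phase: n_second units subsampled from the first phase, y is recorded.
    The estimate is the ratio-type form ybar * p1_first / p1_second (assumed).
    """
    first = rng.sample(range(len(y)), n_first)
    second = rng.sample(first, n_second)
    p1_first = sum(phi1[i] for i in first) / n_first
    p1_second = sum(phi1[i] for i in second) / n_second
    ybar = sum(y[i] for i in second) / n_second
    if p1_second == 0:
        raise ValueError("second-phase sample contains no unit with the attribute")
    return ybar, p1_first, p1_second, ybar * p1_first / p1_second


# Hypothetical population of 20 units; 15 of them possess the attribute.
population_y = [float(10 + 3 * i) for i in range(20)]
population_phi1 = [0] * 5 + [1] * 15

print(two_phase_ratio(population_y, population_phi1, 12, 8, random.Random(1)))
```

Because the first-phase proportion comes from the larger sample, it plays the role that the known population proportion plays in single-phase sampling.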

The estimators $t_1$, $t_2$, $t_3$ and $t_4$ in two-phase sampling take the following forms

$t_{d3} = \bar{y} \exp\left[ \dfrac{p_1' - p_1}{\cdots} \right]$

$t_{d4} = \bar{y} \exp\left[ \cdots \right]$
The bias and MSE expressions of the estimators $t_{d1}$, $t_{d2}$, $t_{d3}$ and $t_{d4}$, up to the first order of
approximation, are respectively given as

The estimators $t_5$ and $t_6$, in two-phase sampling, take the following forms

where $m_1$, $m_2$, $n_1$ and $n_2$ are real constants.



The bias and MSE expressions of the estimators $t_{d5}$ and $t_{d6}$, up to the first order of
approximation, are, respectively, given by
6. Estimator $t_{pd}$ in Two-Phase Sampling

Using a linear combination of $t_{di}$ (i = 0, 1, 2), we define an estimator of the form
where $h_i$ (i = 0, 1, 2) denotes the constants used for reducing the bias in the class of estimators,
H denotes the set of those estimators that can be constructed from $t_{di}$ (i = 0, 1, 2) and R
denotes the set of real numbers (for details see Singh et al. (2008)). Also, $L_i$ (i = 1, 2, ..., 8) are
either real numbers or functions of the known parameters of the auxiliary attributes.

Expressing $t_{pd}$ in terms of the e's, we have

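$t_{pd} = \bar{Y}(1 + e_0) \Big[ h_0 + h_1 (1 + \varphi_1 e_1')^{m_1} (1 + \varphi_1 e_1)^{-m_1} (1 + \varphi_2 e_2')^{-m_2}$

$\qquad + h_2 \exp\left\{ \theta_1 [e_1' - e_1][1 + \theta_1 (e_1' - e_1)]^{-1} \right\}^{n_1} \exp\left\{ \theta_2 e_2' [1 + \theta_2 e_2']^{-1} \right\}^{n_2} \Big]$   (6.3)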

After expanding, subtracting $\bar{Y}$ from both sides of equation (6.3) and neglecting the
terms having power greater than two, we have

$(t_{pd} - \bar{Y}) = \bar{Y} \left[ e_0 + h_1 (m_1 \varphi_1 e_1' - m_1 \varphi_1 e_1 - m_2 \varphi_2 e_2') + h_2 (n_1 \theta_1 e_1' - n_1 \theta_1 e_1 + n_2 \theta_2 e_2') \right]$   (6.4)



Squaring both sides of (6.4) and then taking expectations, we get the MSE of the estimator $t_{pd}$,
up to the first order of approximation, as
Data: (Source: Singh and Chaudhary (1986), p. 177).

The population consists of 34 wheat farms in 34 villages in a certain region of India. The
variables are defined as:

y = area under wheat crop (in acres) during 1974,

$p_1$ = proportion of farms under wheat crop which have more than 500 acres of land during 1971,
and
$p_2$ = proportion of farms under wheat crop which have more than 100 acres of land during 1973.
For this data, we have

$N = 34$, $\bar{Y} = 199.4$, $P_1 = 0.6765$, $P_2 = 0.7353$, $S_y^2 = 22564.6$, $S_{\phi_1}^2 = 0.225490$, $S_{\phi_2}^2 = 0.200535$,

$\rho_{pb_1} = 0.599$, $\rho_{pb_2} = 0.559$, $\rho_{\phi} = 0.725$.



Table 6.1: PRE of different estimators of $\bar{Y}$ with respect to $\bar{y}$.
7. Conclusion

In this paper, we have suggested a class of estimators in single and two-phase
sampling by using the point bi-serial correlation and phi correlation coefficients. From Table 4.1
and Table 6.1, we observe that the proposed estimators $t_p$ and $t_{pd}$ perform better than the other
estimators considered in this paper.